Convolutional neural networks have demonstrated human-level performance in the classification of melanoma and other skin lesions, but evident performance disparities between differing skin tones should be addressed before widespread deployment. In this work, we propose an efficient yet effective algorithm for automatically labelling the skin tone of lesion images, and use it to annotate the benchmark ISIC dataset. We subsequently use these automated labels as the target for two leading bias-unlearning techniques aimed at mitigating skin tone bias. Our experimental results provide evidence that our skin tone detection algorithm outperforms existing solutions, and that unlearning skin tone can improve generalisation and reduce the performance disparity in melanoma detection between lighter and darker skin tones.
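The abstract does not spell out the labelling algorithm, but a common approach to automatic skin tone estimation, and a plausible reading of the method, is to compute the individual typology angle (ITA) over healthy-skin pixels in CIELAB space. Below is a minimal sketch along those lines; the `lesion_mask` input and the ITA-to-tone thresholds are illustrative assumptions, not the paper's exact values.

```python
import numpy as np
from skimage import color

def ita_skin_tone(image_rgb, lesion_mask):
    """Estimate the skin tone of a lesion image via the individual
    typology angle (ITA), computed over non-lesion (healthy skin) pixels.

    image_rgb:   HxWx3 float array in [0, 1]
    lesion_mask: HxW bool array, True where the lesion is
    """
    lab = color.rgb2lab(image_rgb)               # CIELAB channels: L*, a*, b*
    healthy = ~lesion_mask
    L = lab[..., 0][healthy]
    b = lab[..., 2][healthy]
    ita = np.degrees(np.arctan2(L - 50.0, b))    # ITA per pixel, in degrees
    ita_med = np.median(ita)                     # robust per-image summary
    # Illustrative ITA bins; exact thresholds vary across the literature.
    if ita_med > 41:
        return "light"
    elif ita_med > 10:
        return "intermediate"
    return "dark"
```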
Convolutional neural networks have demonstrated dermatologist-level performance in skin lesion image classification, but prediction irregularities arising from biases in the training data are an issue that should be addressed before widespread deployment is possible. In this work, we robustly remove bias and spurious variation from an automated melanoma classification pipeline using two leading bias-unlearning techniques. We show that the biases introduced by surgical markings and rulers, as presented in previous studies, can be reasonably mitigated using these bias-removal methods. We also demonstrate the generalisation benefits of unlearning spurious variation relating to the imaging instrument used to capture lesion images. Our experimental results provide evidence that the effects of the aforementioned biases are notably reduced, with different unlearning techniques excelling at different tasks.
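The abstract does not name the two unlearning techniques, but a widely used mechanism in this family is a gradient-reversal head trained to predict the bias variable (e.g., presence of a surgical marking or the instrument ID) while pushing the shared features to become uninformative about it. A minimal PyTorch sketch of that pattern follows; the layer shapes and the bias head are illustrative, not the paper's architecture.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses (and scales) gradients on the
    backward pass, so the encoder is trained to *confuse* the bias head."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

class DebiasedClassifier(nn.Module):
    def __init__(self, encoder, feat_dim, n_classes, n_bias):
        super().__init__()
        self.encoder = encoder                            # e.g. a CNN backbone
        self.task_head = nn.Linear(feat_dim, n_classes)   # melanoma vs. benign
        self.bias_head = nn.Linear(feat_dim, n_bias)      # marking / ruler / instrument

    def forward(self, x, lamb=1.0):
        z = self.encoder(x)
        return self.task_head(z), self.bias_head(GradReverse.apply(z, lamb))

# Training step (sketch): minimise both losses; the reversed gradients from
# the bias loss drive the encoder to unlearn the bias variable.
# task_logits, bias_logits = model(images)
# loss = ce(task_logits, labels) + ce(bias_logits, bias_labels)
```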
This paper presents a novel approach to the acquisition of language models from corpora. The framework builds on Cobweb, an early system for constructing taxonomic hierarchies of probabilistic concepts that used a tabular, attribute-value encoding of training cases and concepts, making it unsuitable for sequential input like language. In response, we explore three new extensions to Cobweb -- the Word, Leaf, and Path variants. These systems encode each training case as an anchor word and surrounding context words, and they store probabilistic descriptions of concepts as distributions over anchor and context information. As in the original Cobweb, a performance element sorts a new instance downward through the hierarchy and uses the final node to predict missing features. Learning is interleaved with performance, updating concept probabilities and hierarchy structure as classification occurs. Thus, the new approaches process training cases in an incremental, online manner that is very different from most methods for statistical language learning. We examine how well the three variants place synonyms together and keep homonyms apart, their ability to recall synonyms as a function of training set size, and their training efficiency. Finally, we discuss related work on incremental learning and directions for further research.
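As a concrete picture of the encoding the paper describes: each training case pairs an anchor word with a bag of surrounding context words, and sorting an instance through the hierarchy scores each child concept against the case. The sketch below is our own simplified rendering, not the systems' actual code; `node.children` and `node.score` are stand-ins for Cobweb's hierarchy and its category-utility calculation.

```python
from collections import Counter

def encode_case(tokens, i, window=2):
    """Encode token i as an anchor word plus surrounding context words."""
    ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
    return {"anchor": tokens[i], "context": Counter(ctx)}

def sort_down(node, case):
    """Sort an instance down the concept hierarchy, greedily following the
    child whose stored anchor/context distributions best match the case."""
    while node.children:
        node = max(node.children, key=lambda c: c.score(case))
    return node  # the final node predicts missing features (e.g. the anchor)
```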
The evaluation of abstractive summarization models typically uses test data that is identically distributed with the training data. In real-world practice, documents to be summarized may contain input noise caused by text extraction artifacts or data pipeline bugs. The robustness of model performance under the distribution shift caused by such noise is relatively under-studied. We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) caused by different types of input noise across a range of datasets and model sizes. We then propose a lightweight method for detecting and removing such noise from the input during model inference, without requiring any extra training, auxiliary models, or even prior knowledge of the type of noise. Our proposed approach effectively mitigates the loss in performance, recovering a large fraction of the drop, sometimes as much as 11 ROUGE-1 points.
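The abstract leaves the detection mechanism unspecified; one plausible shape for such an inference-time filter is to score each input sentence with the model's own negative log-likelihood and drop statistical outliers. In the sketch below, `nll` is an assumed scoring callback (e.g., per-sentence loss under the summarization model itself, so no auxiliary model is needed), and the z-score threshold is illustrative.

```python
import numpy as np

def filter_noisy_sentences(sentences, nll, z_thresh=2.0):
    """Drop sentences whose negative log-likelihood under the model is an
    outlier relative to the rest of the document (no extra training needed).

    sentences: list of str
    nll:       callable str -> float, per-sentence score from the model
    """
    scores = np.array([nll(s) for s in sentences])
    z = (scores - scores.mean()) / (scores.std() + 1e-8)
    return [s for s, zi in zip(sentences, z) if zi < z_thresh]
```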
Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts while remaining consistent with input images. We present Imagen Editor, a cascaded diffusion model built by fine-tuning Imagen for text-guided image inpainting. Imagen Editor's edits are faithful to the text prompts, which is accomplished by using object detectors to propose inpainting masks during training. In addition, Imagen Editor captures fine details in the input image by conditioning the cascaded pipeline on the original high-resolution image. To improve qualitative and quantitative evaluation, we introduce EditBench, a systematic benchmark for text-guided image inpainting. EditBench evaluates inpainting edits on natural and generated images, covering objects, attributes, and scenes. Through extensive human evaluation on EditBench, we find that object-masking during training leads to across-the-board improvements in text-image alignment -- such that Imagen Editor is preferred over DALL-E 2 and Stable Diffusion -- and that, as a cohort, these models are better at object-rendering than text-rendering, and handle material/color/size attributes better than count/shape attributes.
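To make the object-masking idea concrete: during training, regions proposed by an off-the-shelf object detector stand in for random masks, so the model learns to inpaint whole objects of the kind named in prompts. A minimal sketch of turning detector boxes into a binary inpainting mask; the box format and detector output are assumptions, not the paper's pipeline.

```python
import numpy as np

def boxes_to_inpainting_mask(h, w, boxes):
    """Build a binary inpainting mask from object-detector boxes.

    boxes: iterable of (x0, y0, x1, y1) pixel coordinates, e.g. from an
           off-the-shelf detector run on the training image.
    """
    mask = np.zeros((h, w), dtype=bool)
    for x0, y0, x1, y1 in boxes:
        mask[y0:y1, x0:x1] = True   # masked region to be inpainted
    return mask
```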
This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments had a variety of challenges in areas such as evaluation, learning from offline data, and constraint satisfaction. Our paper describes these challenges in the hope that awareness of them will benefit future applied RL work. We also describe the way we adapted our RL system to deal with these challenges, resulting in energy savings of approximately 9% and 13% respectively at the two live experiment sites.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
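For reference, the released checkpoints can be loaded with the Hugging Face `transformers` library; the snippet below uses the small `bigscience/bloom-560m` variant from the same release so it runs on modest hardware (the full 176B model requires multi-GPU sharding).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

inputs = tok("BLOOM is a 176B-parameter open-access", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(tok.decode(out[0], skip_special_tokens=True))
```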
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures. We analyze a broad class of two-stage procedures that first estimate the treatment effect function, and then use this quantity to estimate the linear functional. We prove non-asymptotic upper bounds on the mean-squared error of such procedures: these bounds reveal that, in order to obtain non-asymptotically optimal procedures, the error in estimating the treatment effect should be minimized in a certain weighted $L^2$-norm. We analyze a two-stage procedure based on constrained regression in this weighted norm, and establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds. These results show that the optimal non-asymptotic risk, in addition to depending on the asymptotically efficient variance, depends on the weighted-norm distance between the true outcome function and its approximation by the richest function class supported by the sample size.
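As a concrete instance of such a two-stage procedure, written in standard treatment-effect notation (ours, not necessarily the paper's): given observations $(x_i, a_i, y_i)$ with known propensities $\pi(x_i)$, first fit an outcome-function estimate $\widehat{\mu}$, then plug it into the augmented estimator of the functional $\tau = \mathbb{E}[\mu(X)]$:

```latex
\widehat{\tau}_n = \frac{1}{n}\sum_{i=1}^{n}
  \Big\{ \widehat{\mu}(x_i) + \frac{a_i}{\pi(x_i)}\,\big(y_i - \widehat{\mu}(x_i)\big) \Big\},
\qquad
\|f\|_{\omega}^{2} = \mathbb{E}\big[\omega(X)\, f(X)^{2}\big].
```

The bounds described above say that the first-stage error $\widehat{\mu} - \mu$ should be controlled in a weighted $L^2$-norm $\|\cdot\|_{\omega}$ of the displayed form, with the weight $\omega$ determined by the propensities; the exact weight is instance-dependent and omitted here.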
We present a temporally extended variation of the successor representation, which we term t-SR. t-SR captures the expected state-transition dynamics of temporally extended actions by constructing successor representations over primitive action-repeat sequences. This form of temporal abstraction does not learn a top-down hierarchy of pertinent task structures, but rather a bottom-up composition of coupled actions and action repetitions. This reduces the number of decisions required in control without learning a hierarchical policy. As such, t-SR directly considers the time horizon of temporally extended action sequences without requiring predefined or domain-specific options. We show that, in environments with dynamic reward structure, t-SR is able to leverage both the flexibility of the successor representation and the abstraction afforded by temporally extended actions. Thus, in a suite of sparsely rewarded grid-world environments, t-SR optimally adapts learnt policies far faster than comparable model-free reinforcement learning methods. We also show that the way in which t-SR learns to solve these tasks requires the learnt policy to be sampled consistently less often than non-temporally-extended policies.
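To ground the construction: a tabular successor representation over (action, repeat-count) options can be learned with a TD update in which an action repeated $k$ times discounts its bootstrap target by $\gamma^k$. The sketch below is our simplified reading of that idea, not the paper's algorithm; the indexing scheme is purely illustrative.

```python
import numpy as np

def tsr_update(M, s, a, k, s_next, a_next, k_next, gamma=0.95, alpha=0.1):
    """One TD update of a successor representation over (state, action,
    repeat-count) options: action `a` repeated `k` primitive steps from
    state `s` lands in `s_next`, where option (`a_next`, `k_next`) is chosen.

    M: array of shape (n_states, n_actions, max_repeat, n_states)
    """
    n_states = M.shape[-1]
    onehot = np.eye(n_states)[s]
    # For brevity this sketch credits only the start state; a fuller version
    # would also accumulate the discounted occupancy of the intermediate
    # states visited during the k repeats.
    target = onehot + (gamma ** k) * M[s_next, a_next, k_next - 1]
    M[s, a, k - 1] += alpha * (target - M[s, a, k - 1])
```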
Spatial-temporal map (STMap)-based methods have shown great potential for processing high-angle video for vehicle trajectory reconstruction, which can meet the needs of various data-driven modeling and imitation learning applications. In this paper, we develop a spatial-temporal deep embedding (STDE) model that imposes parity constraints at both the pixel and instance levels to generate instance-aware embeddings for vehicle stripe segmentation on STMaps. At the pixel level, each pixel is encoded with its 8-neighbour pixels at different ranges, and this encoding is subsequently used to guide a neural network in learning the embedding mechanism. At the instance level, a discriminative loss function is designed to pull pixels belonging to the same instance closer together and to push the means of different instances apart. The output spatial-temporal affinities are then optimized by a clustering algorithm to obtain the final clustering results. On segmentation metrics, our model outperforms five other baselines for STMap processing and shows robustness under the effects of shadows, static noise, and overlap. The model was used to process all public NGSIM US-101 videos to generate complete vehicle trajectories, indicating good scalability and adaptability. Last but not least, the strengths of scan-line-based methods combined with STDE, together with future directions, are discussed. The code, STMap dataset, and video trajectories are publicly available in an online repository. GitHub link: shorturl.at/jklt0.
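The pull/push behaviour described at the instance level matches the well-known discriminative embedding loss of De Brabandere et al.; below is a PyTorch sketch of a loss in that style. The margins are illustrative defaults, not necessarily the paper's values.

```python
import torch

def discriminative_loss(emb, labels, d_var=0.5, d_dist=1.5):
    """Pull pixel embeddings toward their instance mean (variance term) and
    push the means of different instances apart (distance term).

    emb:    (N, D) pixel embeddings
    labels: (N,) integer instance ids
    """
    means, l_var = [], 0.0
    for k in labels.unique():
        e_k = emb[labels == k]
        mu = e_k.mean(dim=0)
        means.append(mu)
        # hinge: only penalise pixels farther than d_var from their mean
        l_var += ((e_k - mu).norm(dim=1) - d_var).clamp(min=0).pow(2).mean()
    means = torch.stack(means)
    l_var = l_var / len(means)
    if len(means) < 2:
        return l_var
    # hinge: only penalise pairs of means closer than 2 * d_dist
    dist = (means.unsqueeze(0) - means.unsqueeze(1)).norm(dim=2)
    off_diag = ~torch.eye(len(means), dtype=torch.bool)
    l_dist = (2 * d_dist - dist[off_diag]).clamp(min=0).pow(2).mean()
    return l_var + l_dist
```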